BACKGROUND OF THE INVENTION

Field of the Invention 

[0002] The present invention is directed to receivers and, more particularly, to

digital signal processing ("DSP") based receivers, and more particularly still, 
to high speed multi-path analog-to-digital converters ("ADCs") and high data 
rate multi-path DSPs. 

Related Art 

[0003] There is an ever-increasing need for higher speed communications 

systems. In order to reduce costs, communications systems are increasingly 
implemented using Very Large Scale Integration (VLSI) techniques. The
level of integration of communications systems is constantly increasing to take 
advantage of advances in integrated circuit manufacturing technology and the 
resulting cost reductions. This means that communications systems of higher 
and higher complexity are being implemented in a smaller and smaller number 
of integrated circuits. For reasons of cost and density of integration, the 
preferred technology is CMOS. 
[0004] Digital Signal Processing ("DSP") techniques generally allow higher 

levels of complexity and easier scaling to finer geometry technologies than 
analog techniques, as well as superior testability and manufacturability. 
However, DSP based communications systems require, for their 
implementation, an analog-to-digital converter ("ADC"). In many 
applications, the ADC is challenging to design. In the extreme, the ADC 
requirements sometimes limit the practicality of building DSP-based 
communications systems. One such case occurs when the speed of the 
communication system is very high, for example in the multi-gigabit per 
second range. 

[0005] There is growing demand for communications systems that operate at 

data rates in the multi-gigabit per second range. Examples of such systems are 
transceivers for optical communications for standards such as OC-48, OC-192, 
and OC-768, 10 gigabit Ethernet, Fibre Channel, etc. Another example is a 
transmission system where the communication channel is a transmission line 
on a printed circuit ("PC") board. These communications systems typically 
operate over short distances and they are used to interconnect chips on a PC 
board or on different PC boards across a back plane in a rack-based system. 
These systems typically operate at data rates of several gigabits per second, 
and there is a need to increase the speed to the limits allowed by the 
technology. Additional examples include: transmission systems operating 
over short lengths of coaxial, twisted pair, or twin-ax cable; and very short 
reach ("VSR") applications, such as from one equipment rack to another. 
[0006] Conventional communications systems have limited ADC speeds and

limited digital signal processing speeds. Therefore, there is a need for methods 
and systems for high speed analog-to-digital conversion and for high speed 
digital signal processing. 

BRIEF SUMMARY OF THE INVENTION 

[0007] The present invention is directed to receivers and, more particularly, to 

digital signal processing ("DSP") based receivers, high speed multi-path 
analog-to-digital converters ("ADCs"), and high data rate multi-path DSPs. 
Aspects of the present invention include, among other things, and without 
limitation, coding and error correcting schemes, timing recovery schemes, 
and equalization schemes. 

[0008] In an embodiment, the present invention is implemented as a multi- 

path parallel receiver in which an analog-to-digital converter ("ADC") and/or 
a digital signal processor ("DSP") are implemented with parallel paths that 
operate at lower rates than the received data signal. In an embodiment, a 
receiver ADC is configured with N parallel paths and a receiver DSP is 
configured with M parallel paths, where M = kN, wherein k is an integer or a 
number of the form 1/s, where s is an integer. In an embodiment, the parallel 
ADC paths are operated in an interleaved fashion. In parallel 
implementations, one or more DSP and/or analog processes, including, 
without limitation, one or more processes that compensate for nonidealities in 
the analog front-end paths, can be performed on a per path basis, as described 
below. 

[0009] In an embodiment, a parallel DSP-based receiver in accordance with 

the invention includes a separate timing recovery loop for each ADC path. 
The separate timing recovery loops can be used to compensate for timing 
phase errors in the clock generation circuit that are different for each path. In 
an embodiment, phase compensation is performed with a phase interpolator or 
phase selector. 
[0010] In an embodiment, a parallel DSP-based receiver in accordance with

the invention includes a separate automatic gain control (AGC) loop for each 
ADC path. The separate AGC loops can be used to compensate for gain errors 
on a path-by-path basis. 
[0011] In an embodiment, a parallel DSP-based receiver in accordance with 

the invention includes a separate offset compensation loop for each ADC path. 
The separate offset compensation loops can be used to independently 
compensate for offsets that are different for each path. 
[0012] In accordance with the invention, one or more adaptive processes are 

implemented to correct for ADC impairments. For example, one or more 
processes, such as timing recovery, phase error correction, gain error 
correction, offset compensation, and/or equalization, are implemented as 
adaptive processes and/or systems that adapt to reduce error. Error is used in 
one or more feedback loops, for example, to generate equalizer coefficients, to 
optimize ADC sampling phase(s) for timing recovery, and/or to optimize gain 
for automatic gain control ("AGC"). Error correction can be used for other
processes as well. 

[0013] Error can be computed in one or more of a variety of ways. For 

example, error can be computed as a difference between input signals and 
decisions as to the values of the input signals. This is referred to herein as a 
decision-directed process. Decision-directed processes can be implemented 
with a slicer. Alternatively, decision-directed processes can be implemented 
with a Viterbi Decoder. Other decision-directed processes can be used as 
well. Other error determination processes can also be used. 

[0014] Examples are provided herein, which typically illustrate timing 

recovery, AGC, and offset cancellation algorithms as decision-directed 
processes, where error is computed at a slicer or equivalent decision device, 
such as a Viterbi decoder. The examples are provided for illustrative purposes
and are not limiting. Based on the teachings herein, one skilled in the relevant 
art(s) will understand that the techniques can be implemented with
non-decision-directed processes as well, and/or in combinations of
decision-directed and non-decision-directed processes.

[0015] In an embodiment the present invention is implemented as a multi- 

channel receiver that receives a plurality of data signals. 

[0016] In accordance with aspects of the invention, one or more of the 

following types of equalization are performed, alone and/or in various 
combinations with one another: 

[0017] Viterbi equalization; 

[0018] feed-forward equalization ("FFE"); and/or

[0019] decision feed-back equalization ("DFE"). 

[0020] Further features and advantages of the invention, as well as the 

structure and operation of various embodiments of the invention, are described 
in detail below with reference to the accompanying drawings. It is noted that 
the invention is not limited to the specific embodiments described herein. 
Such embodiments are presented herein for illustrative purposes only. 
Additional embodiments will be apparent to persons skilled in the relevant 
art(s) based on the teachings contained herein. 

BRIEF DESCRIPTION OF THE FIGURES 

[0021] The present invention will be described with reference to the 

accompanying drawings. The drawing in which an element first appears is 
typically indicated by the leftmost digit(s) in the corresponding reference 
number. 

[0022] FIG. 1 is a high level block diagram of a DSP-based receiver, in 

accordance with an aspect of the present invention. 
[0023] FIG. 2 illustrates an example analog phase interpolator that can be 

implemented with the digital timing recovery system illustrated in FIG. 10, in 

accordance with an aspect of the invention. 






[0024] FIG. 3A is a block diagram of an example parallel receiver, including 

an N-path ADC and an M-path DSP, in accordance with an aspect of the 
invention. 

[0025] FIG. 3B is a more detailed block diagram of an example receiver in 

accordance with an aspect of the invention. 
[0026] FIG. 3C is a block diagram of individual timing recovery loops that 

can be implemented for the N ADC paths illustrated in FIG. 3A or 3B.
[0027] FIG. 3D illustrates an embodiment where the timing recovery module

receives M decisions and M errors from the M DSP paths, in accordance with

an aspect of the invention.

[0028] FIG. 3E illustrates an embodiment where each timing recovery loop

includes a phase locked loop and k phase detectors, in accordance with an

aspect of the invention.

[0029] FIG. 3F illustrates an example embodiment where each timing

recovery loop includes a phase locked loop and 1 phase detector, in

accordance with an aspect of the invention. This is a special case where k=1.
[0030] FIG. 3G illustrates an example embodiment where each timing 

recovery loop includes a phase locked loop and 2 phase detectors, (k=2), in 

accordance with an aspect of the invention. 
[0031] FIG. 3H illustrates an example implementation wherein the timing 

recovery module includes a decoder and a phase selector/phase interpolator, in 

accordance with an aspect of the invention. 
[0032] FIG. 4A is a block diagram of an example receiver that utilizes a track 

and hold device, in accordance with an aspect of the invention. 
[0033] FIG. 4B is a block diagram of an example receiver that utilizes 

multiple track and hold devices in parallel, in accordance with an aspect of the 

invention. 

[0034] FIG. 5 illustrates an example parallel receiver that utilizes, among 

other things, DFE-based offset cancellation on a per path basis, in accordance 
with an aspect of the invention. 
[0035] FIG. 6 illustrates example implementation details of the equalizer

illustrated in FIG. 5, in accordance with an aspect of the present invention. 
[0036] FIG. 7 illustrates an example programmable gain amplifier and an 

example automatic gain control module, in accordance with an aspect of the 
present invention. 

[0037] FIG. 8A illustrates an example implementation for offset mismatch 

compensation in accordance with an aspect of the present invention. 
[0038] FIG. 8B illustrates an example Viterbi decoder-based decision-directed 

error signal generator, in accordance with an aspect of the invention.

[0039] FIG. 9 illustrates another example implementation for offset mismatch 

compensation, in accordance with an aspect of the present invention.

[0040] FIG. 10 is a block diagram of a parallel receiver with independent 

timing recovery loops for each parallel path, in accordance with an aspect of 
the invention. 

[0041] FIG. 11 is a block diagram of an example timing recovery block in 

accordance with an aspect of the invention. 
[0042] FIG. 12 illustrates an example analog phase interpolator that can be 

implemented with the digital timing recovery system illustrated in FIG. 10, in 
accordance with an aspect of the invention.
[0043] FIG. 13 illustrates an example 4-state, 1-step trellis that runs at a clock 

rate substantially equal to the symbol rate, in accordance with an aspect of the 
present invention. 

[0044] FIG. 14 illustrates an example 4-state, M-step trellis that runs at a 

clock rate substantially equal to 1/M th of the symbol rate, in accordance with 
an aspect of the present invention. 
[0045] FIG. 15A illustrates an example rooted trellis, in accordance with an 

aspect of the present invention. 
[0046] FIG. 15B illustrates another example rooted trellis, in accordance with 

an aspect of the present invention. 
[0047] FIG. 15C illustrates another example rooted trellis, in accordance with 

an aspect of the present invention. 
[0048] FIG. 15D illustrates another example rooted trellis, in accordance with

an aspect of the present invention. 
[0049] FIG. 16 illustrates an example systolic implementation of rooted trellis 

computation, in accordance with an aspect of the present invention.
[0050] FIG. 17 is a high-level block diagram of an example parallel Viterbi 

processor in accordance with an aspect of the present invention.
[0051] FIG. 18 is a process flowchart in accordance with an aspect of the 

invention. 
I. Introduction

A. Receivers and Transceivers 

[0052] The present invention is directed to receivers and, more particularly, to 

digital signal processing ("DSP") based receivers, multi-channel receivers, 
timing recovery schemes, and equalization schemes. Various features in 
accordance with the present invention are described herein. The various 
features can generally be implemented alone and/or in various combinations 
with one another. Example implementations of various combinations of 
features of the invention are provided herein. The invention is not, however, 
limited to these examples. Based on the description herein, one skilled in the 
relevant art(s) will understand that the features described herein can be 
practiced alone and/or in other combinations as well.
[0053] FIG. 1 is a high-level block diagram of an example DSP-based receiver 

100, in accordance with the present invention. The DSP-based receiver 100 
receives a data signal 102 through a transmission medium 112 and converts it 
to a digital data signal 106. 
[0054] The DSP-based receiver 100 includes an analog-to-digital converter 

("ADC") 108 that digitizes the data signal 102 and outputs one or more 
internal digital signals 104. The DSP-based receiver 100 also includes a DSP 
110 that performs one or more digital signal processes on the one or more 
digital signals 104, and outputs one or more digital output signals 106. 
[0055] DSP processes in accordance with the present invention are described 

below, which can include, without limitation, equalization, error correction 
(such as hard or soft decoding of, without limitation, convolutional, trellis, or
block codes), timing recovery, automatic gain control, and offset 
compensation. Analog circuitry (not shown in FIG. 1) is optionally provided 
to perform portions of one or more of these functions. 
[0056] In an embodiment, the ADC 108 and/or the DSP 110 are implemented 

with multiple parallel paths, wherein each parallel path operates at a lower 
speed relative to the data signal 102. In an embodiment, the parallel paths are
operated in an interleaved fashion as described below. In an embodiment, the 
ADC 108 is configured with N parallel paths and the DSP 110 is configured 
with M parallel paths, where M = kN, wherein k is an integer or a number in 
the form of 1/s, where s is an integer. In parallel implementations, one or 
more DSP and/or analog processes, including, without limitation, one or more 
processes that compensate for nonidealities in the analog front-end paths, can 
be performed on a per path basis, as described below. 

B. Equalization 

[0057] Optional equalization of data signals is now described with respect to 

FIG. 1. During operation of the DSP-based receiver 100, the data signal 102 is 
received by the receiver 100 through the transmission medium 112. During 
transmission through the transmission medium 112, the data signal 102 is 
typically impaired, due to inter-symbol interference, attenuation, crosstalk, 
noise, and possibly other impairments. These impairments are typically a 
function of, among other things, physical properties and the length of the 
transmission medium 112. These impairments are said to reduce the "eye 
opening" of the data signal 102, making it more difficult to accurately process 
the data signal 102. 
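
As a rough illustration of how inter-symbol interference shrinks the eye opening, the following Python sketch computes a worst-case, noise-free vertical eye opening for binary (+1/-1) symbols. The function name `worst_case_eye_opening`, the sampled pulse response, and the use of a simple peak-distortion bound are assumptions introduced here for illustration; they are not taken from the disclosure.

```python
def worst_case_eye_opening(pulse, cursor):
    """Worst-case vertical eye opening for +/-1 symbols (noise-free).

    pulse  : channel pulse response sampled once per symbol (hypothetical input)
    cursor : index of the main (desired) sample within pulse
    A negative result means the eye is closed in the worst case.
    """
    main = abs(pulse[cursor])
    # Every other tap contributes interference from a neighbouring symbol.
    isi = sum(abs(h) for i, h in enumerate(pulse) if i != cursor)
    return 2.0 * (main - isi)


# An ideal channel leaves the full opening between the +1 and -1 levels.
ideal = worst_case_eye_opening([1.0], 0)
# Pre-cursor and post-cursor interference reduce the opening.
impaired = worst_case_eye_opening([0.25, 1.0, 0.5], 1)
```

Under these assumptions, an equalizer that reduces the interfering taps raises the returned value, which corresponds to the improved "eye opening" discussed above.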

[0058] In an embodiment, the receiver 100 includes one or more equalizers 

(not shown), which may include, without limitation, linear equalizers and/or
non-linear equalizers. The one or more equalizers improve the "eye opening" 
of the data signal 102. The present invention provides parallel and non-parallel 
equalization embodiments. 
[0059] In an embodiment the one or more equalizers perform one or more of 

the following types of equalization: 

feed forward equalization ("FFE"); 

Viterbi equalization; and/or 

decision feedback equalization ("DFE"). 
[0060] In accordance with an aspect of the invention, equalization, including

linear and/or non-linear equalization, is performed. 
[0061] In an embodiment, error correction such as, without limitation, hard or 

soft decoding of convolutional, trellis, or block codes is implemented in a 

multi-path receiver. 

[0062] Example implementations in accordance with aspects of the invention 

are described below. Any of a variety of conventional parallel implementation 
techniques and/or new techniques in accordance with the invention, or 
combinations thereof, can be implemented in a parallel multi-path receiver. 

[0063] It is important not to confuse the concept of "multi-path receiver" with 

the concept of multiple receivers operating concurrently. In the context of this 
disclosure, "multi-path receiver" refers to a receiver where a single input data 
signal is digitized by an array of interleaved ADCs and/or processed by a 
digital signal processor using a parallel implementation, as shown in Figures 
3A and 3B. 

[0064] The examples herein are provided for illustrative purposes. The 

invention is not limited to these examples. 

II. High Speed, DSP-Based Receiver 

[0065] In accordance with an aspect of the invention, the receiver 100 is 

implemented as a high speed, or high data rate, DSP-based receiver that 
receives and digitally processes high data rate data signals 102. High data rate 
signals generally include data signals in the multi-gigabit per second range.

[0066] Generally, a high data rate receiver 100, having a high data rate ADC 

108 and a high speed DSP 110, would require one or more high speed (e.g., 
gigahertz range) clocks. To facilitate implementation on a chip for high data 
rates, in accordance with an aspect of the invention, parallel processing is 
implemented wherein each parallel path operates at a lower clock rate. 
A. Parallel ADC and DSP



[0067] FIG. 3A illustrates the receiver 100 implemented as a parallel

receiver, wherein the ADC 108 is implemented as an array of N ADCs 312-1 
through 312-N, and the DSP 110 is implemented with M parallel paths 314-1 
through 314-M, where M=kN. The N ADCs 312-1 through 312-N and the 
M DSP paths 314-1 through 314-M operate at lower data rates than the 
received data signal 102. It is important to observe that the DSP paths need 
not be independent from one another. In other words, there could be cross- 
connections among the different DSP paths 314-1 through 314-M. 
[0068] In an example embodiment, M=N=4 (i.e., k=1). Other embodiments
use other values for N, M, and k. Motivations to use other values of k, for
example k=2, include, without limitation, further reducing the clock speed to
operate DSP blocks in the receiver. This can be the situation, for example,
when implementing complicated algorithms requiring elaborate DSP
architectures. In all the examples provided in this disclosure it is assumed that
M is larger than or equal to N, therefore k is larger than or equal to one. 
However, it will be apparent to one skilled in the art that other embodiments 
where N is larger than M are also possible without departing from the spirit 
and scope of the present invention. This situation could arise, for example, if 
high-resolution ADCs were needed. In general there is a tradeoff between 
speed and resolution in the design of the ADC. Therefore in an application 
where high resolution ADCs are necessary, the speed of each path would be 
lower and the number of ADC paths required would increase. This could lead 
to a situation where N is larger than M. In this case it is generally not possible 
to compensate errors in all ADCs individually, but only in groups of N/M of 
them. Otherwise, the techniques disclosed herein can be applied equally well 
in this situation. However, for simplicity of description, the examples provided 
in this disclosure use M larger than or equal to N. 
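
The relation M = kN and the grouping that arises when N is larger than M can be sketched as follows. This is a minimal Python illustration; the function names `path_ratio` and `compensation_granularity` are hypothetical and do not appear in the disclosure.

```python
from fractions import Fraction


def path_ratio(n_adc, m_dsp):
    """Return k = M/N, requiring k to be an integer or of the form 1/s."""
    k = Fraction(m_dsp, n_adc)
    if k.numerator != 1 and k.denominator != 1:
        raise ValueError("k must be an integer or of the form 1/s")
    return k


def compensation_granularity(n_adc, m_dsp):
    """Number of ADC paths that share one compensation loop.

    When M >= N each ADC can be compensated individually (returns 1).
    When N > M, ADCs can only be compensated in groups of N/M.
    """
    k = path_ratio(n_adc, m_dsp)
    return 1 if k >= 1 else k.denominator


k_fig_3b = path_ratio(4, 8)                   # N=4, M=8 gives k=2
group_size = compensation_granularity(8, 4)   # N=8, M=4 gives groups of 2
```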
[0069] In FIG. 3A, the data signal 102 is received and digitized into a 

plurality of N parallel signals 104-1 through 104-N by the array of N lower 
speed ADCs 312-1 through 312-N. The ADCs 312-1 through 312-N can be
single-bit ADCs or multi-bit ADCs. Each of the plurality of digitized parallel
signals 104-1 through 104-N typically has a sampling rate lower than the
symbol rate of the received data signal 102, but taken together, have a 
sampling rate substantially the same or higher than the symbol rate of the 
received data signal 102. In an embodiment, the received data signal 102 is a 
high data rate (e.g., gigabit(s) per second range) data signal. If the modulation 
scheme is binary (it encodes only one bit per symbol) the symbol rate is 
substantially equal to the data rate. The symbol rate can be reduced without 
reducing the data rate by using multilevel modulation schemes such as pulse 
amplitude modulation (PAM). For example, two bits per symbol could be 
transmitted by using a 4-level PAM modulation scheme (PAM-4). A binary 
modulation scheme is also known as PAM-2 (other common names are On- 
Off Keying (OOK) or binary antipodal signaling). In order to properly 
recover the data transmitted from the remote end, the receiver needs to take at 
least one sample per symbol of the received signal. These types of receivers 
are usually called "baud-rate-sampled receivers." However in some 
implementations the receiver could take more than one sample per symbol. 
These receivers are often called "oversampled receivers," or "fractionally- 
spaced receivers." Baud-rate-sampled receivers are usually more economical 
because, for the same symbol rate, they require lower speed ADCs than 
oversampled receivers. However, it will be apparent to one skilled in the art 
that the techniques disclosed in this invention can be applied equally well to 
baud rate sampled and/or oversampled receivers, as well as to receivers using 
a variety of modulation schemes, including, but not restricted to, PAM-2, 
multilevel PAM, single-carrier or multi-carrier quadrature amplitude 
modulation (QAM), etc. 
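
The relationship among data rate, symbol rate, and per-ADC sampling rate described above can be sketched in Python as follows. The function names and the assumption of uncoded PAM (no coding overhead) are introduced here for illustration only.

```python
import math


def symbol_rate(data_rate_bps, pam_levels):
    """Symbol rate for uncoded PAM with the given number of levels."""
    bits_per_symbol = math.log2(pam_levels)
    return data_rate_bps / bits_per_symbol


def per_adc_rate(data_rate_bps, pam_levels, n_adc, samples_per_symbol=1):
    """Sampling rate of each of N interleaved ADCs.

    samples_per_symbol=1 models a baud-rate-sampled receiver; larger
    values model an oversampled (fractionally-spaced) receiver.
    """
    total = symbol_rate(data_rate_bps, pam_levels) * samples_per_symbol
    return total / n_adc


pam2 = symbol_rate(10e9, 2)   # binary: symbol rate equals data rate
pam4 = symbol_rate(10e9, 4)   # two bits per symbol halves the symbol rate
```

Under these assumptions, a baud-rate-sampled binary receiver at 10 gigabits per second with eight interleaved ADCs needs `per_adc_rate(10e9, 2, 8)` samples per second from each ADC, which is 1.25e9.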
[0070] A timing recovery module 318 performs timing recovery and provides 

one or more clock signals 319 to the ADC converter array 108. In an 
embodiment, the timing recovery module 318 operates the N lower speed 
ADCs 312-1 through 312-N in a staggered, or interleaved fashion. In other 
words, different phases of the clock signals 319 are provided to each of the
ADCs 312-1 through 312-N. The different phases are staggered from one 
another so that each ADC 312-1 through 312-N samples a different portion 
or phase of the data signal 102. Interleaved samples 104-1 through 104-N 
from the ADCs 312-1 through 312-N are aligned by a retiming module 316. 
Further signal processing is performed in the M-path DSP 110.

[0071] Example operation of the DSP-based parallel receiver 100 illustrated in 

FIG. 3A is now described for a case where the data signal 102 is a 10 gigabit 
per second data signal and the ADC converter array 108 includes eight ADCs 
312 (in other words, N=8 in this example), each operating at approximately 
1250 MHz. The timing recovery module 318 outputs a 1250 MHz, eight- 
phase clock signal 319 on a bus, one phase for each of the ADCs 312-1 
through 312-N. The eight-phase clock signal 319 operates the ADCs 312-1 
through 312-N at 1250 MHz, separated in phase from one another by 45 
degrees (i.e., 360 degrees/8 phases), in this example. 
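
The staggered clock phases and the round-robin ordering of interleaved samples in this example can be sketched in Python. The helper names `clock_phases_deg`, `deinterleave`, and `interleave` are hypothetical, and the sketch models only the sample ordering, not the retiming circuitry itself.

```python
def clock_phases_deg(n_adc):
    """Phase offset, in degrees, of the sampling clock for each ADC."""
    return [360.0 * i / n_adc for i in range(n_adc)]


def deinterleave(samples, n_adc):
    """Split a full-rate sample sequence into N lower-rate ADC streams."""
    return [samples[i::n_adc] for i in range(n_adc)]


def interleave(path_samples):
    """Merge N equal-length ADC streams back into full-rate order."""
    return [s for group in zip(*path_samples) for s in group]


phases = clock_phases_deg(8)        # 0, 45, 90, ... degrees
full_rate = list(range(16))         # stand-in for 16 consecutive samples
paths = deinterleave(full_rate, 8)  # ADC i sees samples i, i+8, ...
restored = interleave(paths)        # same order as full_rate
```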

[0072] A parallel DSP-based receiver in accordance with the invention is 

useful for receiving high data rate signals. A high data rate DSP-based 
receiver in accordance with the invention is useful for lower data rate 
applications as well. 

[0073] In an embodiment, the timing recovery module 318 includes an 

individual timing recovery loop for each of the ADC paths defined by the 
ADCs 312-1 through 312-N. Individual timing recovery loops are described 
below. 

[0074] FIG. 3B illustrates an example implementation of the parallel DSP- 

based receiver 100 illustrated in FIG. 3A, wherein the ADC 108 is a 4-path 
ADC 108 and the DSP 110 is an 8-path DSP 110 (i.e., N=4, M=8, and k=2). 
The example 8-path DSP 110 includes an 8-path parallel FFE 320 and an
8-path parallel Viterbi decoder 322. Example implementations of parallel
Viterbi decoders are described below. Additional example implementations of 
the M-path DSP 110 are provided below. The present invention is not, 
however, limited to these examples. Based on the description herein, one 
skilled in the relevant art(s) will understand that other N-path ADC and/or
M-path DSP configurations are possible.
[0075] In FIG. 3B, the retiming module 316 provides samples of the retimed 

signals to the parallel feedforward equalizer 320, as well as to the timing 
recovery module 318 and to the AGC 310, as illustrated by the dotted lines. 
[0076] In FIG. 3B the receiver 100 is illustrated with a programmable gain 

amplifier 308 and an automatic gain control 310. Implementation examples 
and operation of these components are described below. 
[0077] In an embodiment, a parallel receiver in accordance with the invention 

is designed to receive a single data signal. Alternatively, a parallel receiver in 
accordance with the invention is designed to receive multiple data signals. In 
such an embodiment, the receiver 100 is repeated for each data signal 102.
Each repetition of a parallel multi-path DSP-based receiver is referred to
herein as a slice, each slice having one or more parallel ADC and/or DSP
paths.

[0078] In an embodiment, the receiver 100 illustrated in FIG. 1, is 

implemented with one or more track and hold devices. For example, FIG. 4A 
illustrates a block diagram of a portion of an example receiver including a 
track-and-hold device 402 controlled by a clock generator 404. The track 
and hold device 402 provides a constant analog value to the ADC 108. 

[0079] In an embodiment, the multi-path receiver 100 illustrated in FIG. 1, is 

implemented with a plurality of track and hold devices. FIG. 4B illustrates a 
block diagram of a portion of an example parallel receiver including an array 
408 of parallel track and hold devices 406-1 through 406-N. 



III. Design and Control Considerations 



[0080] In accordance with parallel multi-path receiver aspects of the 

invention, one or more of a variety of types of gain and/or phase errors and 
interleave path mismatches are detected and compensated for. Such errors and 
mismatches can be compensated for on a path-by-path basis and/or on a 
system wide basis. Compensation design and control considerations for
parallel receivers are now described. 

[0081] In accordance with the invention, one or more adaptive processes 

reduce error. Error is used in one or more feedback loops, for example, to 
generate equalizer coefficients, to optimize ADC sampling phase(s) for timing 
recovery, and/or to optimize gain for automatic gain control ("AGC"). Error 
correction can be used for other processes as well. 

[0082] Error can be computed in one or more of a variety of ways. For 

example, error can be computed as a difference between input signals and 
decisions as to the values of the input signals. This is referred to herein as a 
decision-directed process. Decision-directed processes can be implemented 
with a slicer. Alternatively, decision-directed processes can be implemented
with a Viterbi Decoder, as described below with respect to FIG. 8. Other 
decision-directed processes can be used as well. Other error determination 
processes can also be used. 
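
A decision-directed error computation with a slicer can be sketched in Python as follows. The function names, the default PAM-2 levels of -1 and +1, and the sign convention (input minus decision) are assumptions made for illustration.

```python
def slicer(sample, levels=(-1.0, 1.0)):
    """Return the nominal signal level closest to the sample."""
    return min(levels, key=lambda lvl: abs(sample - lvl))


def decision_directed_error(sample, levels=(-1.0, 1.0)):
    """Return (decision, error), with error = input minus decision."""
    decision = slicer(sample, levels)
    return decision, sample - decision


# Binary example: a received value of 0.8 is decided as +1.
decision, error = decision_directed_error(0.8)

# The same slicer handles multilevel PAM when given more levels.
pam4_decision = slicer(1.7, levels=(-3.0, -1.0, 1.0, 3.0))
```

The resulting error value is the quantity that a feedback loop, such as a timing recovery, gain control, or equalizer adaptation loop, would act to reduce.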

[0083] Examples provided herein typically illustrate timing recovery, AGC, 

and offset cancellation algorithms as decision-directed processes, where error 
is computed at a slicer or equivalent decision device, such as a Viterbi decoder.
The examples are provided for illustrative purposes and are not limiting. 
Based on the teachings herein, one skilled in the relevant art(s) will understand 
that the techniques can be implemented with non-decision-directed processes 
as well, and/or in combinations of decision-directed and non-decision-directed 
processes. 
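
As one hedged illustration of a decision-directed process applied on a per path basis, the Python sketch below tracks a separate DC offset estimate for each interleaved path using a simple integrating update. The function name, the step size `mu`, the update rule, and the PAM-2 levels are assumptions chosen for illustration; they are not presented as the specific offset cancellation circuits of the disclosure.

```python
def per_path_offset_loop(samples, n_paths, mu=0.05, levels=(-1.0, 1.0)):
    """Estimate a separate DC offset for each of n_paths interleaved paths.

    samples : full-rate sample sequence; sample n belongs to path n % n_paths
    mu      : loop step size (hypothetical value)
    """
    offsets = [0.0] * n_paths
    for n, x in enumerate(samples):
        p = n % n_paths
        corrected = x - offsets[p]
        # Slicer decision on the offset-corrected sample.
        decision = min(levels, key=lambda lvl: abs(corrected - lvl))
        error = corrected - decision
        # Integrate the error into this path's own offset estimate.
        offsets[p] += mu * error
    return offsets


# Two paths with different offsets; symbols follow a +1, +1, -1, -1 pattern.
true_offsets = [0.1, -0.2]
rx = [(1.0 if (n // 2) % 2 == 0 else -1.0) + true_offsets[n % 2]
      for n in range(2000)]
estimates = per_path_offset_loop(rx, n_paths=2)
```

Because each path keeps its own estimate, offsets that differ from path to path are tracked independently in this sketch.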

A. Path-Based Timing Recovery and Phase Error Compensation 

[0084] Referring to FIG. 3A, in an interleaved embodiment, the multi-phase 

sampling clock 319 provided by the clock recovery module 318 is generated 
by dividing down a higher frequency clock. Imperfections in the clock 
dividing circuitry, however, potentially lead to phase differences between the 
paths that depart from the intended value. This error has a systematic
component and a random component. 

[0085] Most of the random component typically originates in the random jitter 

of the high-frequency clock from which the N-phase sampling clock 319 is 
derived. Therefore the random error component tends to be approximately 
similar for the N interleaved ADCs. 

[0086] The systematic component of the sampling phase error, however, tends 

to originate in a divider circuit, typically implemented within a timing 
recovery module, such as the timing recovery module 318 illustrated in FIG. 
3A, and also in mismatches in the propagation delays of the clocks from the 
timing recovery module to the individual track-and-hold devices (as shown in 
Figure 4B, there is a track-and-hold device 406-1 through 406-N in front of 
each ADC 312-1 through 312-N). Therefore, the sampling instants of the 
input signal experience a periodic jitter with a fundamental frequency fs, 
where fs is the frequency of the sampling clock driving each track and hold. 
When looking at the digital samples of the complete interleaved array, the 
effect of these systematic sampling phase errors is an error in amplitude of the 
digitized samples. This error is detrimental to the accuracy of the ADC 
converter array 108, and it can be a performance-limiting factor. 
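
The way a fixed per-path sampling phase error turns into an amplitude error that repeats every N samples can be sketched in Python. The function name, the sinusoidal test input, and the numeric values below are assumptions chosen for illustration.

```python
import math


def interleaved_sample_errors(skews, f_in, f_total, n_samples):
    """Amplitude error of a unit sine sampled by interleaved paths.

    skews   : fixed timing error in seconds for each path (hypothetical)
    f_in    : input sine frequency in Hz
    f_total : aggregate sampling rate of the interleaved array in Hz
    """
    n_paths = len(skews)
    errors = []
    for n in range(n_samples):
        t_ideal = n / f_total
        t_actual = t_ideal + skews[n % n_paths]
        ideal = math.sin(2 * math.pi * f_in * t_ideal)
        actual = math.sin(2 * math.pi * f_in * t_actual)
        errors.append(actual - ideal)
    return errors


# Four paths at 10 GS/s aggregate; only path 1 has a 5 ps systematic skew.
errs = interleaved_sample_errors([0.0, 5e-12, 0.0, 0.0], 1e9, 10e9, 16)
```

In this sketch the nonzero entries of `errs` fall only on samples taken by the skewed path, so the error pattern recurs at the per-path sampling frequency.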

[0087] In accordance with an aspect of the invention, therefore, methods and 

systems are now described for reducing systematic jitter. The methods and 
systems are based on the M-parallel DSP paths described above, which makes 
it possible to separate the timing recovery module 318 into N loops, each loop 
responding to a phase error in a corresponding data path, which can then be 
compensated for in the corresponding N timing recovery loops. 

[0088] FIG. 3C illustrates an example implementation of the timing recovery
module 318 including multiple timing recovery loops 318-1 through 318-N.
Example implementations of the multiple timing recovery loops 318-1 
through 318-N are provided below. 

[0089] An advantage of separate timing recovery loops is that the systematic 

phase errors introduced in the multi-phase sampling clock 319 by the 
frequency divider circuit can be independently compensated within the N
independent timing recovery loops 318-1 through 318-N. This technique 
substantially reduces and/or eliminates the systematic component of the phase 
error in the interleaved ADC converter array 108, providing increased 
accuracy and ease of design. The systems and methods for compensating 
sampling phase errors described herein can be used in combination with one or 
more of a variety of timing recovery techniques. 

1. Decision-Directed Timing Recovery

[0090] In an embodiment, the DSP-based receiver 100 utilizes one or more 

decision-directed timing recovery processes. For example, FIG. 3D illustrates 
an embodiment where the timing recovery module 318 receives M decisions 
324 and M errors 326 from the M DSP paths. The significance and use of the 
decisions 324 and errors 326 are described below. 

[0091] FIG. 3E illustrates an embodiment where each timing recovery loop 

318-1 through 318-N includes a phase locked loop (PLL) 332 and k phase 
detectors 330. Recall that k relates the number of ADC paths N to the 
number of DSP paths M, where M=kN . Example implementations of the 
phase locked loop 332 and k phase detectors 330 are described below with 
respect to FIG. 11. 

[0092] The M decisions 324 and M errors 326 can be utilized by the timing 

recovery loops 318-1 through 318-N in a variety of ways, depending upon
the number of ADC paths N and the number of DSP paths M, that is,
based upon the value of k. For example, FIG. 3F illustrates an example
implementation for k=1. FIG. 3G illustrates an example implementation for
other values of k. These example implementations are described below with 
respect to FIGS. 10 and 11. 

[0093] FIG. 3H illustrates an example implementation wherein the timing 

recovery module 318 includes a decoder 340 and a phase selector/phase 
interpolator 342. The phase selector/phase interpolator 342 receives P phases 
344-1 through 344-P, where P is an integer, from a clock generator. The
phase selector/phase interpolator 342 also receives N phase interpolator
control signals 346-1 through 346-N from the decoder 340. Alternatively,
the phase selector/phase interpolator 342 receives the N phase interpolator 
control signals 346-1 through 346-N directly from the timing recovery loops 
318-1 through 318-N. 

[0094] The phase selector/phase interpolator 342 outputs N phases 319-1 

through 319-N. P does not necessarily equal N. For example, in an 
embodiment, P=4 and N=8. In another embodiment, P=N=4. The invention is 
not, however, limited to these examples. Based on the description herein, one 
skilled in the relevant art(s) will understand that other values for N and P can 
be used. Example implementations of the phase selector/phase interpolator 
342 are described below with respect to FIGS. 2 and 12. 

[0095] FIG. 10 illustrates an example implementation of the timing recovery
loops 318-1 through 318-N wherein each timing recovery loop 318-1 through
318-N receives a decision from a corresponding DSP path and a sample of the
slicer error from an adjacent DSP path. This configuration is described below
with respect to FIG. 11. Each timing recovery loop 318-1 through 318-N is
designed to drive its associated path phase error towards zero.

[0096] In the embodiment of FIG. 10, the M-path DSP 110 includes an FFE 

1004, a DFE 1006, and slicers 1002-1 through 1002-M. Decisions and slicer 
error signals are shown as being taken from slicers 1002-1 through 1002-M. 
Phase error signals are computed by the timing recovery loops 318-1
through 318-N, based on the decisions and the slicer errors, as shown in more
detail in FIG. 11. This corresponds to an exemplary decision-directed timing 
recovery algorithm. However, other timing recovery algorithms can be 
utilized. 

[0097] In the example of FIG. 10, decisions are generated from slicers 1002, 

and errors are generated as a difference between the slicer decisions and the 
input to the slicers 1002. Alternatively, decisions and errors are generated 
with a Viterbi decoder and channel estimator. For example, in FIG. 8B, a 
Viterbi decoder 804 receives an input signal 810 through a feed-forward
equalizer 812, and outputs decisions 806, which can be final decisions or 
tentative decisions. Tentative decisions can be provided by the Viterbi 
decoder 804 with less delay than final decisions, while final decisions tend to 
be more accurate than tentative decisions. The choice between tentative 
decisions and final decisions is generally a trade-off between latency and 
accuracy. The choice can be influenced by the quality of the input signal 810.
The decisions 806 are provided to a channel estimator 808, the output of
which is subtracted from the input signal 810. The resulting error is analogous
to the slicer error described above.

[0098] FIG. 11 is a block diagram of an example implementation of the timing
recovery loop 318-1 illustrated in FIGS. 3C-3H and FIG. 10. Timing recovery
loops 318-2 through 318-N are similarly configured. In FIG. 11, the timing 
recovery loop 318-1 includes k phase detectors 1104-1 through 1104-k, which 
generate k phase error signals 1106-1 through 1106-k. Each phase error signal 
1106-1 through 1106-k is generated by cross-correlating a decision 1110 for a 
given path with a slicer error 1108 corresponding to an adjacent path, as 
illustrated in FIGS. 3F and 3G, for example.

[0099] The phase error signals 1106-1 through 1106-k are computed in the 

exemplary embodiment of FIG. 11 by, for example, using any of a variety of the
well-known Mueller and Muller algorithms. See, for example, K.H. Mueller
and M. Muller, "Timing Recovery in Digital Synchronous Data Receivers,"
IEEE Transactions on Communications, COM-24, pp. 516-531, May 1976,
incorporated herein by reference in its entirety, where the phase error is based
on the precursor of the channel impulse response at the output of the FFE,
with the precursor taken one symbol period before the sample on which the 
decision is based. In this algorithm, the phase error is computed with the 
slicer error delayed by one symbol period. In a serial implementation this is 
achieved, for example, by introducing a pipeline register clocked at the 
symbol rate in the error path going to the phase detector. In a parallel- 
processing implementation, the one symbol delay of the error is achieved by, 
for example, taking the error sample from an adjacent path, as shown in
FIG. 10. In other words, the decision 1110 comes from the same path where 
phase is being controlled, but the error 1108 comes from the adjacent path 
corresponding to the samples of the input signal taken one baud period earlier. 
Because of the parallel architecture of the DSP, these samples appear at the 
same cycle of the DSP clock, but on an adjacent path. 
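
The relationship between the serial and parallel forms of the phase detector can be sketched as follows. This Python sketch is illustrative only and assumes a simple product form for the phase error (decision multiplied by the slicer error delayed by one symbol). The function names, the sign convention, and the handling of the first path of each block are assumptions, not details taken from FIG. 11.

```python
def serial_phase_errors(decisions, errors, initial_error=0.0):
    """Serial form: a register clocked at the symbol rate delays the error."""
    out = []
    delayed_error = initial_error
    for decision, error in zip(decisions, errors):
        out.append(decision * delayed_error)
        delayed_error = error            # pipeline register update
    return out

def parallel_phase_errors(decisions, errors, prev_block_last_error):
    """Parallel form: one block of M symbols per DSP clock cycle.

    Path j pairs its own decision with the slicer error of path j-1, which
    holds the sample taken one baud period earlier. Path 0 needs the error
    of the last path of the previous block, which is passed in explicitly.
    """
    out = []
    for j, decision in enumerate(decisions):
        earlier_error = errors[j - 1] if j > 0 else prev_block_last_error
        out.append(decision * earlier_error)
    return out
```

Under these assumptions, processing a symbol stream in blocks of M with parallel_phase_errors yields the same sequence as serial_phase_errors, provided the last error of each block is carried into the next block.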
[00100] A delay 350 is inserted in the error 1108-1 because the error 1108-1
comes from a preceding block relative to the decision 1110-1. The delay 350
is substantially equal to M cycles of the input or baud clock, or one cycle of
the DSP clock. For example, where the data signal 102 is a 10 Gbit/sec signal,
and where M equals 4 (i.e., 4 DSP paths), the delay 350 is set to one cycle of
a DSP clock running at 1/4 of 10 Gbits/sec, or approximately 400 picoseconds.
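
The delay value in this example follows from simple arithmetic, shown here as a short Python sketch. The function name is hypothetical, and the sketch assumes one symbol per bit so that the baud clock is 10 GHz.

```python
def dsp_cycle_delay_seconds(baud_rate_hz, m_paths):
    """Delay equal to M baud periods, which is one DSP clock period."""
    return m_paths / baud_rate_hz

# 10 Gbit/sec signal, M = 4 DSP paths (values from the example above).
delay_s = dsp_cycle_delay_seconds(10e9, 4)
delay_ps = delay_s * 1e12   # approximately 400 picoseconds
```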


[00101] The phase error signals 1106-1 through 1106-k are filtered by an 
accumulate and dump filter 1112 and further filtered by an integral filter 1118. 
The sum of the proportional and integral paths is used to control a numerically 
controlled oscillator ("NCO") 1114. Therefore, the phase locked loop 
illustrated by FIG. 11 is a second-order (or proportional plus integral) loop.
Digital control words 1116 generated by the NCO 1114 are used to control a
phase selector (not shown in FIG. 11).
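
A behavioral sketch of such a proportional plus integral loop is given below in Python. It is a simplified model under stated assumptions, not the circuit of FIG. 11: the class name, the gains kp and ki, the modeling of the accumulate and dump filter as a plain sum of the k phase errors of one DSP cycle, and the width of the control word are all hypothetical.

```python
class SecondOrderLoop:
    """Proportional plus integral loop filter driving a modulo phase accumulator."""

    def __init__(self, kp, ki, nco_bits=8):
        self.kp = kp                     # proportional gain (assumed value)
        self.ki = ki                     # integral gain (assumed value)
        self.integral = 0.0              # state of the integral filter
        self.phase = 0.0                 # NCO phase accumulator
        self.modulus = 1 << nco_bits     # control words wrap at 2**nco_bits

    def update(self, phase_errors):
        # Accumulate and dump: combine the k phase errors of this DSP cycle.
        dumped = sum(phase_errors)
        # Integral path remembers past errors; proportional path does not.
        self.integral += self.ki * dumped
        control = self.kp * dumped + self.integral
        # NCO: integrate the control value and wrap around.
        self.phase = (self.phase + control) % self.modulus
        return int(self.phase)           # digital control word
```

In this model the integral state keeps advancing the NCO phase even when the instantaneous phase error is zero, which is what allows a second-order loop to track a constant frequency offset.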



2. Phase Selector 



[00102] In an embodiment, phase compensation is performed with a phase 
interpolator or phase selector. In an embodiment, the phase selector digitally 
generates multi-phase sampling clocks by, for example, taking a weighted sum 
of multiple (e.g., 4) phases with finite rise and fall times. FIGS. 2 and 12
illustrate example phase selectors in accordance with aspects of the invention. 
The example phase selector in FIG. 2 generally provides faster response times. 
Alternatively, a conventional phase selector is utilized. The present invention 
is not, however, limited to digitally controlled phase selectors. 
a. DAC-Based Phase Selector

[00103] FIG. 2 illustrates an example phase selector 202 in accordance with an 
aspect of the invention. The phase selector 202 shown in FIG. 2 exemplifies a 
situation where the number of output phases fs1 through fsN may be different
from the number of input phases f's1 through f'sP. The number of output
phases fs1 through fsN is always N, the same as the number of ADC paths.
However, the number P of input phases f's1 through f'sP could be smaller than
N. In an embodiment, N is a multiple of P. 

[00104] The phase selector 202 includes N interpolator sub-blocks 202-1 
through 202-N, that receive digital control words C1 through CN, respectively.
The digital control words C1 through CN correspond to the phase interpolator
control signals 346-1 through 346-N described above with respect to FIG. 3H. 

[00105] In FIG. 2, phase interpolator sub-block 202-1 is illustrated in detail, 
operation of which is now described. The digital control word C1 is applied
through a decoder to current-mode digital-to-analog converters ("DACs") 204- 
1 through 204-P, which control the bias current of respective differential pairs 
208-1 through 208-P. The inputs to the differential pairs 208-1 through 208-P 
are taken from consecutive input phases. The drain currents of the differential 
pairs 208-1 through 208-P are combined in output resistors 212 and 214, 
which generate the output phase fs1. The output phase fs1 is thus a weighted
sum of the input phases f's1 through f'sP, wherein the weighting is determined
by the DACs 204-1 through 204-P, under control of the control signal C1.
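
The weighted-sum principle can be illustrated with an idealized model. The Python sketch below assumes purely sinusoidal input phases, which real clock edges with finite rise and fall times only approximate, and computes the amplitude and phase of the weighted sum. The function name and the example weights are hypothetical and do not represent actual DAC codes.

```python
import math

def interpolate(weights, input_phases):
    """Return (amplitude, phase) of sum_i weights[i] * sin(theta + input_phases[i]).

    Each input clock is modeled as a unit sinusoid. The weights play the role
    of the DAC-controlled bias currents of the differential pairs.
    """
    re = sum(w * math.cos(p) for w, p in zip(weights, input_phases))
    im = sum(w * math.sin(p) for w, p in zip(weights, input_phases))
    return math.hypot(re, im), math.atan2(im, re)

# P = 4 input phases spaced a quarter period apart (assumed).
INPUT_PHASES = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]

# Equal weight on two consecutive phases lands halfway between them.
amp_mid, phase_mid = interpolate([1, 1, 0, 0], INPUT_PHASES)

# Unequal weights move the output phase toward the more heavily weighted input.
amp_q, phase_q = interpolate([3, 1, 0, 0], INPUT_PHASES)
```

In this idealized model the output phase is set by the ratio of the weights, so a finer DAC resolution gives a finer phase step. The output amplitude also varies with the weights, which a practical design would need to account for.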

[00106] There are N phase interpolator sub-blocks 202-1 through 202-N, each 
one corresponding to an output phase. The number of input phases P is 
typically smaller than the number of output phases, N. It must be noted that, 
although the circuit shown in FIG. 2 uses particular components such as
NMOS transistors and resistors, there are many alternative implementations, 
including, but not limited to, FET or BJT circuits in other integrated circuit 
technologies such as silicon germanium, indium phosphide, gallium arsenide, 
etc. The essential aspect of this phase selector 202 is the use of digitally 
controlled weighted sums of two input phases to generate an output phase.
This concept can be implemented in many alternative ways without departing 
from the spirit and scope of the present invention, as will be apparent to one 
skilled in the art. 

b. Resistive Interpolation Ring 

[00107] In an embodiment, multi-phase sampling clocks 319 are generated by a
resistive phase interpolator. FIG. 12 illustrates an example timing recovery 
block 1202 implementation, which is an example embodiment of the timing 
recovery block 318 illustrated in FIG. 10. The timing recovery block 1202 
includes a resistive interpolation ring phase selector 1204. Input phases f's1-N
1206 from a clock generator are provided to the resistive interpolation ring
phase selector 1204. In an embodiment, the input phases f's1-N 1206 are
derived from a divider operating on an independent clock. When the 
frequency of operation of the divided down clock is relatively high, the clock 
edges tend to have finite rise and fall times that are comparable to the period 
of the waveform. The number of input phases P need not be the same as the 
number of ADC paths N. This is explained more clearly in connection with 
FIG. 2. 

[00108] By interpolating between two such waveforms of phase difference
corresponding to a quarter of a period, new waveforms, fs1-N, with phase
differences corresponding to fractions of, for example, a quarter of a period
from the original signals f's1-N 1206 are obtained. In an embodiment, the
phase difference is electrically controlled by changing the relative 
interpolation factors by, for example, changing the values of the interpolation 
resistors in a digital fashion, driven by, for example, the timing recovery 
circuit. 

[00109] The example phase selector implementations described herein are 

provided for illustrative purposes. The present invention is not limited to 
these examples. Based on the teachings herein, one skilled in the relevant 
art(s) will understand that other phase selector methods and systems can be
utilized. 

B. Gain and Offset Mismatch Compensation 

[00110] In accordance with an aspect of the invention, methods and systems are 
provided for reducing gain errors, offsets, and/or undesired sampling clock 
phase differences among the paths defined by the ADCs 312-1 through 312-N 
(FIG. 3A). 

1. DSP-Based Adaptive Path Gain and Offset Mismatch Control

[00111] In accordance with an aspect of the invention, gain and offset 
mismatches between paths are compensated for in a DSP, wherein gain factors 
adapt for individual paths. 

[00112] FIG. 5 illustrates an example DSP-based parallel receiver 500, which is 
an example implementation of the receiver 100 illustrated in FIG. 3A. The
receiver 500 utilizes DFE-based offset cancellation on a per path basis, in 
accordance with an aspect of the invention. Under this approach, offsets 
originating in the ADC 108 or anywhere in the analog front end are 
individually controlled for each ADC path by an equalizer adaptation 
algorithm to compensate the offsets in the digital domain independently for 
each path. In the embodiment of FIG. 5, a single Programmable Gain 
Amplifier 308 with global gain control is shown. As will be discussed later, 
independent gain control for each ADC path can also be implemented in the 
digital domain using, for example, a Feed-forward Equalizer (FFE). FFE-based digital
control can be omitted where, for example, the gain errors of the ADC paths 
can be accurately controlled by design, thus requiring little or no digital gain 
mismatch compensation. In a more common situation, relatively significant 
gain mismatches exist among the ADC paths, therefore digital compensation 
of gain mismatches is preferred. A scheme where gain mismatches in the 
ADC paths are individually compensated in the analog domain will be 
discussed later in connection with FIG. 7. Alternatively, gain mismatches can
be digitally compensated using the Feed-forward Equalizer. FIG. 5 also shows 
the independent phase error compensation technique already discussed in 
connection with FIGS. 10, 11, and 12. It will be apparent to one skilled in the 
art that the sampling phase error, gain error, and offset compensation 
techniques disclosed herein can be used independently of each other and in 
any combination required, depending on the need for compensation of the 
different errors that circuit design and/or manufacturing tolerance 
considerations motivate in each specific situation. 
[00113] In FIG. 5, the M-path DSP 110 includes an M-path parallel FFE 508, 
M individual decision and error paths, and an M-path DFE 510. In an 
embodiment, the number of parallel ADC paths N equals the number of 
parallel DSP paths M. The invention is not, however, limited to this 
embodiment. 

[00114] The example parallel receiver 500 shows an implementation of a DFE 
and offset cancellation scheme that can not only compensate for offset, but can 
also compensate for offset mismatches among the interleaved array of ADC 
paths. In an embodiment, the offset cancellation scheme is implemented with 
one or more DC taps per ADC path in the DFE 510. This approach is 
described in more detail in FIG. 9, where the DC taps are implemented by the 
integrators inside blocks 902-1 through 902-M. FIG. 8 also uses DC taps in 
the DFE to compensate for offsets independently for each ADC path, but in 
this case compensation is done in the analog domain. Since each interleave 
uses an independent, and independently adapted, DC tap, offsets that do not 
necessarily match across the interleaved paths can be compensated. 
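
The per-path DC tap adaptation can be sketched as follows. This Python sketch is a simplified model and not the implementation of FIG. 9: it assumes noise-free binary symbols of +1 and -1, one DC tap per path, an equal number of ADC and DSP paths, and a simple integrator update driven by the slicer error. The function name, the step size mu, and the offset values are hypothetical.

```python
import random

def adapt_dc_taps(samples, n_paths, mu=0.05):
    """Adapt one DC tap per interleaved path from the slicer error.

    samples is the interleaved stream; sample n belongs to path n % n_paths.
    """
    taps = [0.0] * n_paths
    for n, x in enumerate(samples):
        path = n % n_paths
        y = x - taps[path]                     # subtract this path's DC tap
        decision = 1.0 if y >= 0.0 else -1.0   # binary slicer
        slicer_error = y - decision            # residual offset on this path
        taps[path] += mu * slicer_error        # integrator (DC tap) update
    return taps

rng = random.Random(1)
OFFSETS = [0.10, -0.20, 0.05, 0.00]   # per-path offsets (assumed values)
samples = [rng.choice((-1.0, 1.0)) + OFFSETS[n % 4] for n in range(4 * 500)]
taps = adapt_dc_taps(samples, n_paths=4)
```

Because each path owns its tap, each tap settles near the offset of its own path, so offsets that differ from path to path are removed without affecting one another.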

[00115] In FIG. 5, the timing recovery module 318 receives decisions and 
errors from the M individual decision and error paths in the DSP 110, and 
adjusts the phases of the sampling clocks 319-1 through 319-N accordingly. 

[00116] In the receiver 500, gain factors are individually controlled for each 
path after the ADC array 108. Overall dynamic range of the ADC converter 
array 108 is optionally controlled by the AGC module 310 and the PGA 
module 308. This helps to optimize use of all of the bits of the ADC array
108. 

[00117] FIG. 6 illustrates an example of a 4-tap adaptive FFE 508 implemented
as a 4-parallel array having paths 602-1 through 602-4. The number of taps
and the degree of parallelization can be varied as desired. In the example
implementation of FIG. 6, the parallel paths 602-1 through 602-4 are
essentially four adaptive transversal filters. 

[00118] For an ideal channel (i.e., a channel where there are no gain 
mismatches in the paths), it would be economical to share the coefficients of 
the filters in the paths 602-1 through 602-4. In other words, it would be 
economical to make a_r^(0) = a_r^(1) = a_r^(2) = a_r^(3) (r = 0,...,3) in FIG. 6. In practice,
however, gain mismatches typically occur. By making the coefficients 
independent of one another, and adapting them independently, the coefficients 
of the M-paths will individually converge to potentially different values to 
compensate for gain errors of the lower frequency ADCs 312-1 through
312-N.
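
Independent adaptation of the coefficients can be illustrated with a deliberately reduced model. The Python sketch below uses a single coefficient per path instead of the 4-tap filters of FIG. 6, assumes noise-free binary symbols, and uses a decision-directed LMS update. The function name, the step size mu, and the path gain values are hypothetical.

```python
import random

def adapt_path_gains(samples, n_paths, mu=0.05):
    """Adapt one FFE coefficient per path with a decision-directed LMS update."""
    coeffs = [1.0] * n_paths
    for n, x in enumerate(samples):
        path = n % n_paths
        y = coeffs[path] * x                   # one-tap equalizer output
        decision = 1.0 if y >= 0.0 else -1.0   # binary slicer
        error = decision - y                   # slicer error
        coeffs[path] += mu * error * x         # LMS update for this path only
    return coeffs

rng = random.Random(2)
PATH_GAINS = [1.00, 0.90, 1.10, 0.95]   # per-path ADC gains (assumed values)
samples = [PATH_GAINS[n % 4] * rng.choice((-1.0, 1.0)) for n in range(4 * 500)]
coeffs = adapt_path_gains(samples, n_paths=4)
```

In this model each coefficient converges to approximately the reciprocal of its own path gain, which is the behavior that shared coefficients could not provide.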

[00119] In addition to reducing gain mismatches in the paths, independent 
adaptation of the gain coefficients tends to reduce bandwidth mismatches in 
the paths, which otherwise could cause impulse responses of the paths to differ 
from one another. 

[00120] The FFE can also act as an interpolation filter. Having independent 

coefficients for the different parallel sections, as explained before, means that 
the FFE can also compensate for sampling phase errors in the ADCs. This is 
particularly true when the input signal is bandlimited to half the baud rate or 
less. This provides an alternative way to compensate for sampling phase 
errors, as well as gain errors in the ADCs of an interleaved array. 

2. Automatic Gain Control (AGC) 

[00121] In accordance with an aspect of the invention, gain errors in the 

interleaved ADC paths are compensated for on a path by path basis, using 
path-specific AGCs, wherein gain factors adapt for individual paths. FIG. 7
illustrates an example path-specific AGC implementation, which can be 
utilized to reduce gain errors in the interleaved paths. The example
path-specific AGC implementation illustrated in FIG. 7 can be implemented in
place of the FFE-based gain error compensation scheme illustrated in FIGS. 5
and 6. It can also be combined with offset compensation schemes like the ones 
discussed in connection with FIGS. 5 and 8. 

[00122] FIG. 7 illustrates an implementation of a portion 700 of the receiver 
100 illustrated in FIG. 3A, in accordance with an aspect of the invention. The
portion 700 includes a plurality of path-specific AGCs 310-1 through 310-N, 
which control a PGA array of path-specific PGAs 308-1 through 308-N. 

[00123] Path-specific AGCs 310-1 through 310-N are now described with 
reference to path-specific AGC 310-1. Path-specific AGCs 310-2 through 
310-N are configured similarly. Path-specific AGC 310-1 includes an 
absolute value module 704-1 and a lowpass filter 706-1, which provides a 
measured amplitude 708-1 to a differencer 726-1. The differencer 726-1 
subtracts a desired amplitude 712-1 from the measured amplitude 708-1 and 
outputs a difference value 714-1 to an adder 716-1. The adder 716-1 together 
with the accumulator 722-1 constitute a digital integrator. The integrator 
integrates the difference value 714-1 and outputs a PGA control value 724-1 to 
PGA 308-1. PGA control value 724-1, or a portion thereof, is optionally 
provided to ADC 312-1 to adjust a reference voltage therein. Path-specific 
AGCs 310-2 through 310-N operate in a similar fashion. 
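
The signal flow of one path-specific AGC can be modeled behaviorally as shown below. This Python sketch is illustrative only. It assumes a first-order low-pass filter, a scaled digital integrator, and a PGA whose gain decreases exponentially with the control value; the class and function names, the filter constant, the step size, and the gain law are assumptions and are not taken from FIG. 7.

```python
import math

class PathAGC:
    """Absolute value, low-pass filter, differencer, and digital integrator."""

    def __init__(self, desired, lp_alpha=0.1, step=0.05):
        self.desired = desired      # desired amplitude
        self.lp_alpha = lp_alpha    # low-pass filter constant (assumed)
        self.step = step            # integrator step size (assumed)
        self.measured = 0.0         # low-pass filter state (measured amplitude)
        self.accum = 0.0            # accumulator of the digital integrator

    def update(self, adc_sample):
        # Absolute value followed by a first-order low-pass filter.
        self.measured += self.lp_alpha * (abs(adc_sample) - self.measured)
        # Differencer: measured amplitude minus desired amplitude.
        difference = self.measured - self.desired
        # Adder plus accumulator form the integrator.
        self.accum += self.step * difference
        return self.accum           # PGA control value

def pga_gain(control):
    """Hypothetical PGA law: a larger control value gives a smaller gain."""
    return math.exp(-control)

def settle(path_gain, desired=0.5, n_samples=3000):
    """Run one path to steady state and return the final ADC output amplitude."""
    agc = PathAGC(desired)
    control = 0.0
    amplitude = 0.0
    for n in range(n_samples):
        symbol = 1.0 if n % 2 == 0 else -1.0
        adc_sample = pga_gain(control) * path_gain * symbol
        amplitude = abs(adc_sample)
        control = agc.update(adc_sample)
    return amplitude
```

With these assumptions, paths with different gains settle to the same output amplitude, since each integrator keeps moving until its own measured amplitude equals the desired amplitude.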

[00124] In the example of FIG. 7, gain errors are obtained or generated in the 
digital domain, and used to control the independent PGAs 308-1 through 308- 
N. Since the gain error is measured in the digital domain, any gain errors 
introduced by the lower frequency ADCs 312-1 through 312-N will be driven 
to approximately zero by the AGC circuitry. 

[00125] The present invention is not, however, limited to this example. Based 

on the description herein, one skilled in the relevant art(s) will understand that 
automatic gain control can be implemented in other ways. For example, and 
without limitation, where gain mismatches of the interleaved ADC paths are
relatively negligible, automatic gain control can be shared by all of the ADC 
paths, wherein the PGAs 308-1 through 308-N share a common control signal. 

3. Analog Compensation 

[00126] FIG. 8 illustrates an example implementation for gain and offset 

mismatch compensation, where offset associated with each ADC 312-1 
through 312-N in the interleaved ADC array 108 is substantially cancelled in 
the analog domain. Analog cancellation can be utilized in place of, or in 
addition to digital cancellation. Offsets introduced by each of the lower 
frequency ADCs 312-1 through 312-N are preferably measured in the digital 
domain. Alternatively, offsets introduced by each of the lower frequency 
ADCs 312-1 through 312-N are measured in the analog domain. 

[00127] In a similar way, the gain errors can be compensated for by controlling 
the reference voltage of the ADCs. In this case, the PGA can be shared across 
all the interleaves. 

4. Alternative Implementations 

[00128] FIG. 9 illustrates an exemplary receiver implementation that 
compensates offset mismatches. The exemplary implementation can be 
further modified to compensate gain errors between the ADC paths as well. 
Based on the description herein, one skilled in the relevant art(s) will 
understand that the exemplary implementation illustrated in FIG. 9 can be 
modified in a variety of ways to compensate for gain errors. 

IV. Parallel Equalization 

[00129] In accordance with an aspect of the present invention, one or more 

types of equalization are performed in a parallel multi-path receiver. 
A. Parallelization of a Viterbi Decoder

[00130] In an embodiment of the present invention, Viterbi equalization is 

performed in a multi-path receiver. 
[00131] Parallel Viterbi decoders are described in, for example, Fettweis and
Meyr, "Parallel Viterbi Algorithm Implementation: Breaking the
ACS-Bottleneck," IEEE Transactions on Communications, Vol. 37, No. 8, August
1989, and Fettweis and Meyr, "High-Rate Viterbi Processor: A Systolic Array
Solution," IEEE Transactions on Communications, Vol. 37, No. 9, August
1990, both of which are incorporated herein by reference in their entireties.

[00132] In accordance with an aspect of the invention, Viterbi decoders are
parallelized by the DSP parallelization factor M. This allows the Viterbi
process to be run at a clock rate of fB/M, where fB is the symbol rate of the
receiver. For example, for fB = 3.125 GHz and M = 8, the Viterbi processor
would run at a clock rate of 390.625 MHz. The invention is not, however,
limited to this example.

[00133] For a given number of decoder states S, the amount of hardware 
needed for the parallel implementation generally grows linearly with the 
degree of parallelization M. This allows large parallelization factors M to be 
implemented, and makes implementation of Viterbi decoders feasible at 
relatively high symbol rates. 
[00134] Parallelization is based on the idea of defining an M-step trellis (also 

with S states), which represents the state transitions after M symbol periods. 
Branch metrics for the M-step trellis can be computed using S "rooted 
trellises." Computation of the rooted trellises can be parallelized. 
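
The M-step trellis idea can be sketched in a few lines of Python. The sketch below is a minimal model under stated assumptions: branch metrics are random integers on a fully connected 4-state trellis, only path metrics are tracked (survivor paths are omitted), and the function names are hypothetical. It computes the M-step branch metrics from S rooted trellises and checks that a single add-compare-select (ACS) step on the M-step trellis matches M serial ACS steps.

```python
import random

INF = float("inf")

def acs(path_metrics, branch):
    """One add-compare-select step; branch[i][j] is the metric from state i to j."""
    n_states = len(path_metrics)
    return [min(path_metrics[i] + branch[i][j] for i in range(n_states))
            for j in range(n_states)]

def rooted_trellis(root, branches, n_states):
    """Best metric from a single root state to every state after len(branches) steps."""
    metrics = [0 if s == root else INF for s in range(n_states)]
    for branch in branches:
        metrics = acs(metrics, branch)
    return metrics

def m_step_branch_metrics(branches, n_states):
    """Row i is the rooted trellis of state i; the rows are mutually independent."""
    return [rooted_trellis(root, branches, n_states) for root in range(n_states)]

rng = random.Random(0)
S, M = 4, 8
branches = [[[rng.randint(0, 9) for _ in range(S)] for _ in range(S)]
            for _ in range(M)]
start = [3, 0, 5, 1]    # arbitrary initial path metrics (assumed)

# Serial decoder: M ACS steps at the symbol rate.
serial = start
for branch in branches:
    serial = acs(serial, branch)

# Parallel decoder: one ACS step on the M-step trellis.
parallel = acs(start, m_step_branch_metrics(branches, S))
```

Since each rooted trellis depends only on the branch metrics of the current block and not on the incoming path metrics, the S rooted trellises can be evaluated concurrently, leaving a single ACS step per block in the feedback loop.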
[00135] FIG. 13 illustrates an example 4-state, 1-step trellis 1300 that runs at a 
clock rate substantially equal to the symbol rate, in accordance with an aspect 
of the present invention. 
[00136] FIG. 14 illustrates an example 4-state, M-step trellis 1400 that runs at a 
clock rate substantially equal to 1/M-th of the symbol rate, in accordance with
an aspect of the present invention. 
[00137] FIGS. 15A through 15D illustrate example rooted trellises, in

accordance with aspects of the present invention. 
[00138] FIG. 16 illustrates an example systolic implementation of rooted trellis 

computation, in accordance with an aspect of the present invention.
[00139] FIG. 17 is a high level block diagram of an example parallel Viterbi 

processor in accordance with an aspect of the present invention.

V. Error Correction 

[00140] In an embodiment, the invention includes error correction processing. 
This processing can be done by the Viterbi decoder or elsewhere. Error 
correction processing includes, but is not limited to, hard-decision decoding or 
soft-decision decoding of convolutional, trellis, or block codes. 

VI. Methods of Operation 

[00141] FIG. 18 illustrates a process flowchart 1800 for implementing the 
present invention. For exemplary purposes, the process flowchart 1800 is 
described below with reference to one or more of the example system 
implementations illustrated in one or more of the drawing FIGS. 1-17. The 
present invention is not, however, limited to the example system 
implementations illustrated in drawing FIGS. 1-17. Based on the description 
herein, one skilled in the relevant art(s) will understand that the process 
flowchart 1800 can be implemented with other system implementations as 
well. Such other implementations are within the spirit and scope of the 
present invention. 

[00142] The process begins with step 1802, which includes receiving a data 

signal having a symbol rate. For example, in FIG. 1, a data signal 102 is 
received through transmission medium 112. 

[00143] Step 1804 includes generating N sampling signals having a frequency 

that is lower than the symbol rate, the N sampling signals shifted in phase 
relative to one another. For example, FIG. 3A illustrates a timing recovery
module 318, which generates N timing control signals 319-1 through 319-N,
as illustrated in FIG. 3C. The timing control signals 319-1 through 319-N
have a lower frequency than the symbol rate of the received signal, and are 
staggered in phase from one another, as described above. 

[00144] Step 1806 includes controlling N analog-to-digital converter ("ADC") 

paths with the N sampling signals to sample the data signal at the phases. This 
is described above, for example, with respect to FIG. 3A. 

[00145] Step 1808 includes individually adjusting one or more parameters for
each of the N ADC paths. Step 1808 can include, without limitation,
individually adjusting each of the N sampling signals to reduce sampling
phase errors in the N ADC paths, individually adjusting for offsets in the N
ADC paths, and/or individually adjusting for gain errors in the N ADC paths.

[00146] Step 1810 includes generating a digital signal representative of the 

received data signal from samples received from the N ADC paths. In FIG. 1, 
this is illustrated by the output digital signal(s) 106. 

[00147] Steps 1802 through 1810 are illustrated as discrete sequential steps for 

illustrative purposes. Steps 1802 through 1810 are not, however, limited to 
performance in discrete sequential steps. In practice, one or more of steps 
1802 through 1810 are typically performed in other sequences, and/or using 
feedback from the same step, and/or using input and/or feedback from one or 
more other steps. 

VII. Conclusions 

[00148] The present invention has been described above with the aid of 
functional building blocks illustrating the performance of specified functions 
and relationships thereof. The boundaries of these functional building blocks 
have been arbitrarily defined herein for the convenience of the description. 
Alternate boundaries can be defined so long as the specified functions and 
relationships thereof are appropriately performed. Any such alternate 
boundaries are thus within the scope and spirit of the claimed invention. One
skilled in the art will recognize that these functional building blocks can be 
implemented by discrete components, application specific integrated circuits, 
processors executing appropriate software, and the like, and/or combinations 
thereof. 

[00149] While various embodiments of the present invention have been 
described above, it should be understood that they have been presented by way 
of example only, and not limitation. Thus, the breadth and scope of the 
present invention should not be limited by any of the above-described 
exemplary embodiments, but should be defined only in accordance with the 
following claims and their equivalents. 